Objective:
To make up for class attendance, I am working on this project on Celiac Disease. The goal is to complete background research on Celiac Disease: to understand its cause and the research that has been done toward finding a cure. I will then screen a drug database for compounds that could act on the protein responsible for the negative reactions to gluten products in Celiac patients. Ideally, such a small-molecule compound would bind to the protein in place of the natural amino acid residues of the gliadin peptide. I will then report my findings.
Background:
Celiac Disease (CD) is also known as celiac sprue or gluten-sensitive enteropathy. It is relatively common, especially in the United States, where one out of 133 people is affected. This high incidence accounts for the variety of gluten-free food products often found in grocery stores.
CD is a chronic inherited autoimmune condition that results in abnormal reactions to gliadin, which is found in gluten products such as wheat, rye, barley, and triticale. Because it is an autoimmune response rather than the overreaction to an allergen seen in allergic reactions, it cannot be classified as a food allergy. As a result, unlike people who may grow out of a food allergy, people with Celiac Disease will be affected for their whole life.
Symptoms of the disease are triggered by consumption of gluten. When individuals with this condition eat gluten, their bodies react in a way that causes their immune systems to damage the villi of their small intestines.
Figure 1. Drawing of the digestive system with the small intestine highlighted and the stomach, liver, small intestine, and colon labeled.
Figure 2. Drawing of a section of the small intestine with detail of villi. The small intestine and villi are labeled. The microvilli are the small hair-like appendages on the villi.
These damaged villi and microvilli are subsequently unable to absorb nutrients from food effectively. Proteins, carbohydrates, fats, vitamins, and minerals cannot be taken up (in some cases, water and bile salts are not absorbed either). As a result, if CD is not treated, it may be life-threatening and may increase the individual’s risk of other disorders that arise from poor nutrition. These conditions include:
- Iron deficiency anemia
- Early onset osteoporosis or osteopenia
- Vitamin K deficiency associated with risk for hemorrhaging
- Vitamin and mineral deficiencies
- Central and peripheral nervous system disorders – usually due to unsuspected nutrient deficiencies
- Pancreatic insufficiency
- Intestinal lymphomas and other GI cancers (malignancies)
- Gall bladder malfunction
- Neurological manifestations
Although Celiac Disease is a genetic disease, its onset can occur at any time in life, and genetic testing alone may not be accurate in diagnosing it. Instead, the Celiac Disease Foundation website outlines 5 recommended blood tests that are more accurate. Each tests for the presence of a different antibody in the blood after the consumption of gluten. These tests, along with the genetic testing results and possible symptoms, may provide an accurate diagnosis.
Current treatment:
Because no drug has yet been found that prevents the autoimmune response associated with Celiac Disease, the only effective treatment is a gluten-free diet, as even the smallest amounts of gluten can be damaging. Once gluten products are removed from the diet, the small intestine begins to heal and eventually returns to normal function. The length of this process, however, varies by individual and may take up to several years.
Treatments for this condition are being studied. Ideas include enzymes that work to break down and detoxify gluten before it reaches the small intestine. If this succeeds, patients would be able to eat gluten products as long as they also have access to this enzyme or enzyme complex.
Target information:
Celiac Disease is strongly associated with HLA-DQ2, a serotype group in the HLA-DQ serotyping system. A serotype is a group of cells or microorganisms (such as bacteria or viruses) distinguished by the set of antigens they carry; an antigen is a molecule that provokes a specific immune response, such as the production of antibodies. HLA-DQ2 is not part of gluten: it is a receptor protein on the surface of human antigen-presenting cells that binds peptide fragments and displays them to T cells. Antibodies, in turn, are blood proteins that recognize and attach to their corresponding antigens and mark them for elimination. In Celiac Disease, HLA-DQ2 binds gliadin peptides from consumed gluten and presents them to T cells, which elicits an unnecessary immune response, including antibody production, that leads to damage to the villi in the small intestine.
Figure 3. Crystal structure of HLA-DQ. PDB ID: 1S9V. Structure is shown as sticks, with carbon as green, oxygen as red, nitrogen as blue, and hydrogen hidden.
The gliadin peptide that serves as the antigen that causes the reaction to gluten is found in many different species of the tribe Triticeae. It is resistant to various proteases and peptidases in the intestinal system.
Proposed Treatment:
If a small-molecule compound can bind to the protein in place of the amino acid residues of the gliadin peptide (and with higher affinity), it may prevent the immune system from recognizing the gliadin peptide and overreacting to it. If the compound can prevent an immune response, the villi of the small intestine will not be damaged and Celiac Disease symptoms can be prevented even with the consumption of gluten products. The discovery of a compound that can serve this function is the goal of this project.
Protocol:
1. The crystal structure of HLA-DQ2 containing the gliadin peptide was found on the PDB. The PDB ID is 1S9V.
2. Chain B of the protein was removed in PyMol.
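Step 2 was done interactively in PyMol. As a rough stand-in for readers without PyMol, the same kind of edit can be sketched in plain Python by filtering the fixed-column PDB records on the chain identifier (column 22). The function name and the choice of record types are illustrative assumptions, not what PyMol does internally.

```python
def remove_chain(pdb_text: str, chain_id: str) -> str:
    """Return PDB text without the coordinate records of one chain.

    Assumes the standard fixed-column PDB layout, where the chain
    identifier sits in column 22 (string index 21).
    """
    coordinate_records = ("ATOM  ", "HETATM", "ANISOU", "TER   ")
    kept = []
    for line in pdb_text.splitlines():
        record_name = line[:6].ljust(6)
        is_coordinate = record_name in coordinate_records
        if is_coordinate and len(line) > 21 and line[21] == chain_id:
            continue  # drop records that belong to the unwanted chain
        kept.append(line)
    return "\n".join(kept) + "\n"
```

A call such as `remove_chain(open("1S9V.pdb").read(), "B")` would return the text to write to a new file; the input file name here is hypothetical. Header records that mention chain B (for example SEQRES) are left untouched by this sketch.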
3. The validity of the new protein was determined with MolProbity:
Go to MolProbity website – TIP: use Internet Explorer
‘Choose File’ - Load your PDB file to the website (use the file you have from above instead of entering the PDB identifier directly into MolProbity)
Copy down the resolution (to put in your table below)
Add Hydrogens,
On next page, Choose the option to Add the flips ‘Asn/Gln/His flips’
Leave all the defaults checked and ‘regenerate H’s’
If it asks you to save the file with Hydrogens – skip it for now.
Analyze all-atom contacts and geometry
Use all defaults >> Run Programs to Perform these Analyses
View the Multi-Criterion Chart
Save the data for your table below
View the Multi-Criterion Kinemage to see what type of errors exist in the active site.
Make note of which type they are for your table below.
HINT: to see the active site – toggle on and off the ‘hets’ button
Include a ‘snip’ of your traffic light table from MolProbity (i.e. green, yellow, red table)
Also include these values for the structure in your notebook
1S9V: Together
Metric | Values | Percentages
Resolution | 2.22 Å |
Clashscore, all atoms | 8.45 |
Poor rotamers | 8 | 2.38%
Ramachandran outliers | 2 | 0.55%
Ramachandran favored | 347 | 95.59%
Cβ deviations >0.25Å | 0 | 0%
MolProbity score^ | 2.05 |
Residues with bad bonds | 0/1480 | 0%
Residues with bad angles | 1.1843 | 0%
Error in Active Site | 1 |
4. Once the protein crystal structure was deemed to be accurate, it was transferred to the DDFE through WinSCP into the folder Celiac1.
5. The ligand library to be screened was selected from /home/chem204/DatabasesVDS/LabVS3_PTP1bLibrary (CB1k_10) and transferred to the Celiac1 folder as well.
6. The number of ligands in the library was determined by running the countsdf.pl file found in the LabVS3_Library folder with the command perl countsdf.pl.
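The contents of countsdf.pl are not shown here, so the following is only a sketch of one common way to count ligands in an SDF file: each record in that format ends with a line containing `$$$$`, so counting those lines gives the number of molecules. The function names are illustrative.

```python
def count_sdf_records(sdf_text: str) -> int:
    """Count molecules in SDF text by counting '$$$$' record terminators."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


def count_sdf_file(path: str) -> int:
    """Same count, reading the file line by line to keep memory use low."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip() == "$$$$")
```

Running the count again after the control ligands are added is a quick way to confirm that the library has the expected size before docking.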
7. Positive and negative controls were added to the library. Three positive control ligands were found from http://www.bindingdb.org/jsp/dbsearch/PrimarySearch_pubmed.jsp?pubmed=50020709&pubmed_submit=TBD, and are compounds that are known to be inhibitors of enzymes similar to 1S9V.
PubChem CID | Ki value | Molecular Weight | Molecular Formula | XLogP3-AA | H-Bond Donor | H-Bond Acceptor
44437812 | 300,000.0 nM | 839.934 g/mol | C40H57N9O11 | -2.7 | 8 | 12
44437813 | 40,000.0 nM | 853.961 g/mol | C41H59N9O11 | -2.3 | 8 | 12
44437814 | 200,000.0 nM | 867.987 g/mol | C42H61N9O11 | -2 | 8 | 12
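To compare the three positive controls on a common scale, the Ki values listed for them can be converted to pKi, defined as -log10 of Ki in mol/L. The sketch below uses only the three Ki values recorded in this notebook; the variable and function names are illustrative.

```python
import math

# Ki values in nM for the three positive controls, keyed by PubChem CID.
controls_ki_nm = {
    "44437812": 300_000.0,
    "44437813": 40_000.0,
    "44437814": 200_000.0,
}


def pki_from_nm(ki_nm: float) -> float:
    """Convert a Ki given in nM to pKi = -log10(Ki in mol/L)."""
    return -math.log10(ki_nm * 1e-9)


# List the controls from strongest (lowest Ki) to weakest binder.
for cid, ki_nm in sorted(controls_ki_nm.items(), key=lambda item: item[1]):
    print(f"CID {cid}: Ki = {ki_nm / 1000:.0f} uM, pKi = {pki_from_nm(ki_nm):.2f}")
```

All three values fall in the micromolar range, with CID 44437813 (40 uM) the strongest of the three, so the positive controls should be expected to score as modest binders rather than as very tight ones.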
8. The identified ligands, as well as aspirin as a random control ligand and the extracted original peptide, were downloaded as 2D structures. They were converted to 3D at pH 7.4 with OpenBabel. The resulting structures were checked in PyMol.
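The exact OpenBabel invocation used for step 8 is not recorded. One plausible way to script it, assuming the `obabel` command-line program is installed and on the PATH, uses its `--gen3d` option (generate 3D coordinates) and `-p` option (add hydrogens appropriate for a given pH). The helper names and the input file name in the usage note are hypothetical.

```python
import shutil
import subprocess


def build_obabel_command(input_path: str, output_path: str, ph: float = 7.4) -> list:
    """Build an obabel command that writes 3D coordinates protonated for `ph`."""
    return ["obabel", input_path, "-O", output_path, "--gen3d", "-p", str(ph)]


def convert_to_3d(input_path: str, output_path: str, ph: float = 7.4) -> None:
    """Run the conversion; raises if obabel is missing or exits with an error."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(build_obabel_command(input_path, output_path, ph), check=True)
```

For example, `convert_to_3d("CID_44437812.sdf", "CID_44437812_3DpH.sdf")` would produce one of the file names used later in the protocol. The generated geometry should still be inspected visually, as was done here in PyMol.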
9. A gold.conf program file was generated with the HERMES interface.
Connecting to the graphical interface for GOLD
Make remote connection to DDFE using a graphical user interface (GUI) for GOLD
Open Xming server
Go to Start, Programs, Xming, Xming
Open Xlaunch
Figure 2. GOLD & Hermes
Go to Start, Programs, Xming, XLaunch
Select ‘Multiple Windows’
Select ‘Start no client’,
Skip next screen by selecting ‘Next’ then ‘Finish’ on next screen
Open Putty in Programs
Connect to Host Name: ddfe.cm.utexas.edu on Port 22 using SSH
On the left side of the window, Select the ‘SSH’ tab and then the ‘X11’ or ‘Tunnels’ tab
‘Enable X11 forwarding’
X display location: leave blank or enter localhost:0
# this is the default display on your computer
‘Open’
Login as user: type your user name for the DDFE (your UTEID)
Enter password
You must put yourself into the directory where your protein file is.
Type ‘ls’ to see the contents and ‘cd’ to change directories
Then type this to open gold with the graphical user interface:
$gold
Ignore the ‘BadFont’ error message, if present
Don’t load a Conf file at the top (that is what you will be making here)
Step through the Configuration Options to set up your file
Skip Wizard
Skip Templates
Protein > Load protein <nameofyourpdbfile>.pdb
Gold 5.0 has a separate window for Global Options and a specific window for operations on your protein.
Under the tab for your PDB file name (to the right) – the tab may just say “ID”:
Protonation & Tautomers > Add Hydrogens
Write down how many hydrogens added. (2253 hydrogens added)
Skip - Flip Asn GLn and HIS tautomers. We won’t worry about these right now.
Extract/Delete Waters: ‘Delete Remaining Waters’ (don’t select any of them to save).
Write down how many waters removed. (89 waters removed)
Delete Ligand
NOTE: If there is more than one ligand, you will need to go into the Hermes visualizer window to figure out which ligand
Go to View >> Protein Explorer
Click on the ‘+’ (plus sign) to see the different objects.
Extract the Ligand by clicking the ID tab and “delete ligands”, save as ‘LigandExtracted.mol2’
(this will be saved for defining the cavity site)
NOTE: you already saved a different PubChem version of this ligand for validation docking
Back in WinSCP – make sure your LigandExtracted has an extension
If not, then add it to the file (just add .mol2 to the end)
Skip the remaining options for the protein.
Under Global Options:
Define Binding Site – ‘Select One or more ligands’
‘One or more ligands’ - choose the single ligand that you had extracted.
‘Select all atoms within 7.5 Angstroms’
Leave ‘Generate a cavity’ unchecked
Check – ‘Detect cavity’
Check – ‘Force all H bond donors/acceptors ….’
– verify active site in image on the Hermes visualizer
(only a small region around the ligand of the protein will be highlighted in gray)
In the Gold GUI – go back to Global Options
Select Ligands – you will need to find where your library is (probably in your Celiac1 directory)
e.g. All_pH.sdf
This is the file you need to link to for your ligand library.
Then make sure the number of conformations per ligand or GA Runs is set to ‘10’
Skip the Reference Ligand
Skip ‘Configure Waters’
Skip ‘Ligand Flexibility’
Leave the defaults for ‘Fitness & Search Options’ - it will use CHEMPLP scoring function.
‘GA Settings’ – 10%
Output Options
Output directory: leave as it is (‘.’)
UNCHECK – save ligand rank (.rnk) files
UNCHECK – save ligand log files
UNCHECK – save initialized ligand files
Save solutions to one file:
‘YourTargetvsYourLibraryRun1.sdf’ e.g. “<YOURPROTEINNAME>vsCB1kRun1.sdf”
bestranking_list_filename
‘BestYourTargetvsYourLibraryRun1.lst’ e.g. “Best<YOURPROTEINNAME>vsCB1kRun1.lst”
Skip ‘Information in File’ and ‘Selecting Solutions’ (We will keep all solutions)
Skip GoldMine
Skip Parallel GOLD – we will run in parallel but it will be executed remotely instead of at this console
Skip ‘Constraints’
‘Atom Typing’ - Automatically set atom and bond types (for the ligand only):
Make sure only one box is checked - ‘Ligand’ only
At the top of the page hit Save
Hit ‘Finish’ to save the file
Save GOLD conf file as gold.conf
Save protein as <PDBname>protein.mol2
Then close GOLD/Hermes
10. The generated gold.conf file was verified and the autoscale and radius were changed to 1 and 18 Å, respectively.
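The edit in step 10 was made by hand. If it needs to be repeated for several runs, a small helper can rewrite `key = value` lines in the configuration text. The sketch assumes that the autoscale and radius settings each appear on their own line in that form; the sample text in the usage note is made up for illustration and is not a real gold.conf.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Replace the value on every 'key = value' line; fail if the key is absent."""
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda match: match.group(1) + value, conf_text)
    if count == 0:
        raise KeyError(f"{key!r} not found in configuration text")
    return new_text
```

For instance, applying `set_conf_value(text, "autoscale", "1")` and then `set_conf_value(result, "radius", "18")` to the illustrative text `"autoscale = 0.1\nradius = 10\n"` gives `"autoscale = 1\nradius = 18\n"`. Raising an error when a key is missing guards against silently leaving a setting unchanged.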
11. The ligands were concatenated onto LibrarypH.sdf with the command:
cat CID_44437812_3DpH.sdf CID_44437813_3DpH.sdf CID_44437814_3DpH.sdf CID_2244aspirin_pH.sdf OriginalPeptidepH.sdf >> LibrarypH.sdf
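Because the cat command in step 11 uses `>>`, the five control files are appended to whatever LibrarypH.sdf already contains rather than replacing it. A Python sketch of the same append behavior is shown below; the function name is illustrative.

```python
def append_files(source_paths, destination_path):
    """Append each source file to the destination, like `cat a b >> dest`."""
    # "ab" opens for appending in binary mode and creates the file if needed.
    with open(destination_path, "ab") as destination:
        for path in source_paths:
            with open(path, "rb") as source:
                destination.write(source.read())
```

As with `>>`, running the append twice adds the controls twice, so it is worth recounting the records in the library afterwards.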
12. The scriptgoldscanthisjob.sh script file was transferred from the /home/chem204/scripts directory to the Celiac1 folder.
13. The 1005 ligands were screened with 201 processors and the results were concatenated.
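With 1005 ligands and 201 processors, an even split gives 5 ligands per processor. How scriptgoldscanthisjob.sh actually divides the work is not described here, so the following is only a sketch of one way to split an SDF library into near-equal chunks; both function names are illustrative.

```python
def split_sdf_records(sdf_text: str) -> list:
    """Split SDF text into records, each ending with its '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "$$$$":
            records.append("".join(current))
            current = []
    return records


def chunk_evenly(items: list, n_chunks: int) -> list:
    """Divide items into n_chunks consecutive groups whose sizes differ by at most 1."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for index in range(n_chunks):
        end = start + size + (1 if index < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks
```

Each chunk could then be written to its own file and docked as a separate job, with the per-job result files concatenated at the end as in step 13.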
Stanford URJ article on Celiac Inhibitors
http://www.stanford.edu/group/journal/cgi-bin/wordpress/wp-content/uploads/2012/09/Cho_NatSci_2005.pdf